Agent Communication Models
This document provides comprehensive data model documentation for agent communication schemas, focusing on:
The GenerateScriptRequest model used for browser automation requests, including goal specification, target URL handling, DOM structure representation, and constraint definitions
The corresponding response model and validation rules
The agent message payload structure, conversation context management, and state preservation mechanisms
Field definitions, optional parameter handling, and data type specifications
Examples of request/response cycles, error handling patterns, and validation scenarios
The relationship between agent models and the reactive agent system architecture
The agent communication models span three primary layers:
Request/response models: Strongly typed Pydantic models defining the shape of incoming/outgoing data
Routers: FastAPI endpoints that validate inputs and orchestrate service calls
Services: Business logic that interacts with LLMs, sanitizers, and external systems
Builds payloads and sends HTTP requests"] MAP["agent-map.ts
Defines agent endpoints"] end subgraph "API Layer" RT["routers/browser_use.py
POST /api/agent/generate-script"] RR["routers/react_agent.py
POST /api/genai/react"] end subgraph "Models" REQ1["models/requests/agent.py
GenerateScriptRequest"] RES1["models/response/agent.py
GenerateScriptResponse"] REQ2["models/requests/react_agent.py
ReactAgentRequest"] RES2["models/response/react_agent.py
ReactAgentResponse"] end subgraph "Services" SVC1["services/browser_use_service.py
AgentService.generate_script"] SVC2["services/react_agent_service.py
ReactAgentService.generate_answer"] end subgraph "Agents" AG["agents/react_agent.py
GraphBuilder, AgentState, conversion helpers"] end EX --> MAP EX --> RT EX --> RR RT --> REQ1 RT --> RES1 RR --> REQ2 RR --> RES2 RT --> SVC1 RR --> SVC2 SVC1 --> AG SVC2 --> AG
Diagram sources
Section sources
This section documents the two primary agent communication schemas and their relationships.
GenerateScriptRequest Model#
Purpose: Defines the input schema for generating a browser automation action plan from a natural language goal.
Fields:
goal: Required string describing the automation task
target_url: Optional string; defaults to empty string if omitted
dom_structure: Optional dictionary; defaults to empty dictionary if omitted
constraints: Optional dictionary; defaults to empty dictionary if omitted
Validation and behavior:
Goal is mandatory; router rejects requests without it
DOM structure and constraints are optional and used to enrich the LLM prompt
Router forwards validated fields to the service
Data type specifications:
goal: string
target_url: string | null
dom_structure: dict[str, Any] | null
constraints: dict[str, Any] | null
Optional parameter handling:
Empty string fallback for target_url
Empty dict fallback for dom_structure and constraints
Section sources
GenerateScriptResponse Model#
Purpose: Defines the standardized response for automation plan generation.
Fields:
ok: Boolean flag indicating success or failure
action_plan: Optional dictionary containing the generated JSON action plan
error: Optional string describing the error on failure
problems: Optional list of validation problem strings
raw_response: Optional string containing the raw LLM output for inspection
Validation and behavior:
On success: ok is true and action_plan is populated
On validation failure: ok is false, problems is set, error describes the issue
On general failure: ok is false, error contains the error message
Section sources
ReactAgentRequest and ReactAgentResponse Models#
Purpose: Define the input and output schemas for the reactive agent system that handles general conversational tasks with optional tool use.
ReactAgentRequest fields:
messages: Required list of AgentMessage entries; minimum length 1
google_access_token: Optional string; supports multiple aliases for tolerance
pyjiit_login_response: Optional nested PyjiitLoginResponse object
AgentMessage fields:
role: Literal role among “system”, “user”, “assistant”, “tool”
content: Required string with minimum length 1
name: Optional string
tool_call_id: Optional string; alias supported
tool_calls: Optional list of tool call dictionaries
ReactAgentResponse fields:
messages: Final conversation state including the agent reply
output: Content of the latest assistant message
Validation and behavior:
Messages list must not be empty
Role must be one of the allowed literals
Tool calls are preserved when present
Response mirrors the final state of the conversation
Section sources
The agent communication architecture integrates extension-driven payload construction, API validation, service orchestration, and agent/graph execution.
executeAgent.ts" participant Map as "Agent Map
agent-map.ts" participant API as "FastAPI Router
routers/browser_use.py" participant Svc as "Service
services/browser_use_service.py" participant Prompt as "Prompt Template
prompts/browser_use.py" participant LLM as "LLM" participant San as "Sanitizer
utils/agent_sanitizer.py" Ext->>Map : Resolve endpoint "/api/agent/generate-script" Ext->>Ext : Build GenerateScriptRequest payload Ext->>API : POST /api/agent/generate-script API->>Svc : generate_script(goal, target_url, dom_structure, constraints) Svc->>Prompt : Compose prompt with DOM info Svc->>LLM : Invoke chain with prompt LLM-->>Svc : Raw response text Svc->>San : sanitize_json_actions(response_text) San-->>Svc : (action_plan, problems) Svc-->>API : Result {ok, action_plan, error, problems} API-->>Ext : GenerateScriptResponse
Diagram sources
GenerateScriptRequest/Response Workflow#
This workflow demonstrates the end-to-end cycle for generating a browser automation action plan.
Diagram sources
Section sources
React Agent Message Payload and Conversation State#
The reactive agent system manages conversation context and preserves state across turns.
Diagram sources
Section sources
DOM Structure Representation and Target URL Handling#
The extension captures DOM information and constructs the GenerateScriptRequest payload.
url, title, interactive elements"] Extract --> BuildReq["Build GenerateScriptRequest:
goal, target_url, dom_structure, constraints"] BuildReq --> Send["Send POST /api/agent/generate-script"] Send --> End(["Receive GenerateScriptResponse"])
Diagram sources
Section sources
Validation Rules and Error Handling Patterns#
Validation spans multiple layers:
Router-level validation ensures required fields are present
Service-level prompt composition and LLM invocation
Sanitizer validates JSON structure and action semantics
Error responses maintain a consistent shape
Diagram sources
Section sources
The following diagram shows key dependencies between models, routers, services, and utilities.
Diagram sources
Section sources
DOM structure truncation: Interactive elements are limited to avoid excessive payload sizes
Prompt token limits: DOM summaries cap the number of interactive elements included
Caching: The reactive agent graph is cached to reduce compilation overhead
Optional fields: Using optional fields reduces unnecessary data transfer and processing
Common issues and resolutions:
Missing goal: Router returns HTTP 400 with a descriptive message
Validation failures: Service returns ok=false with problems list and raw_response for debugging
General errors: Service returns ok=false with error message
React agent errors: Service logs and returns a generic apology message
Section sources
The agent communication models provide a robust, typed interface for both browser automation and conversational AI tasks. The GenerateScriptRequest model enables precise automation planning by incorporating DOM context and constraints, while the ReactAgentRequest/Response models support rich conversational exchanges with optional tool use. Validation and error handling are consistently applied across layers to ensure predictable behavior and clear feedback.